Techniques for dynamic machine learning integration

ABSTRACT

Various embodiments are generally directed to techniques for dynamically integrating machine learning (ML) functionality into computing systems, such as a content services platform (CSP), for instance. Many embodiments include ML integrated into a CSP and using production content as corpora (e.g., training and/or evaluation data). Some embodiments are particularly directed to generating and updating data for training and evaluating ML models, then making identified ML models available in various target environments. For example, embodiments may provide automatic, or semi-automatic, updating and deploying of ML models for making inferences, such as inferring labels for data in a content repository of a CSP.

BACKGROUND

Machine learning is the study of computer algorithms that improve automatically through experience. Typically, machine learning algorithms build a model based on sample data, referred to as training data, in order to make predictions or decisions without explicitly being programmed to do so. Machine learning may utilize specialized software and/or hardware components that require integration to operate in conjunction with non-machine-learning software and/or hardware. For example, data must be sourced and prepared to generate training data. In another example, after a machine learning model is produced based on training data, the model needs to be deployed and hosted before use for making predictions or decisions.

SUMMARY

This summary is not intended to identify key or essential features of the described subject matter, nor is it intended to be used in isolation to determine the scope of the described subject matter. The subject matter should be understood by reference to appropriate portions of the entire specification of this patent, any or all drawings, and each claim.

In one aspect, the present disclosure relates to an apparatus comprising a processor and a memory comprising instructions that when executed by the processor cause the processor to perform one or more of: obtain source data from one or more data sources, the source data comprising a plurality of data items; produce training data based on the source data, wherein production of the training data includes removal of at least one of the plurality of data items from the source data based on one or more data metrics associated with a machine learning framework; provide the training data to the machine learning framework to generate a model version; receive the model version from the machine learning framework; deploy the model version to a target environment based on a status setting corresponding to the model version; and integrate a production repository associated with the model version to the target environment.

In some embodiments, the target environment includes a content services platform. In some such embodiments, the target environment includes a content repository of the content services platform. In various embodiments, the one or more data sources comprise at least a portion of the production repository. In many embodiments, the memory comprises instructions that when executed by the processor cause the processor to generate a log of activity associated with one or more of obtaining the source data from the one or more data sources, producing the training data based on the source data, providing the training data to the machine learning framework, receiving the model version from the machine learning framework, deploying the model version to the target environment based on the status setting corresponding to the model version, and integrating the production repository associated with the model version to the target environment. In several embodiments, each data item includes data and metadata corresponding to the data. In various embodiments, production of the training data includes transformation of at least one of the plurality of data items in the source data from a format incompatible with the machine learning framework into a format compatible with the machine learning framework. In one or more embodiments, the one or more data metrics comprise one or more of a file type, a file size threshold, a quality threshold, and a reliability indicator. In various embodiments, production of the training data includes normalizing the source data.

In another aspect, the present disclosure relates to at least one non-transitory computer-readable medium comprising a set of instructions that, in response to being executed by a processor circuit, cause the processor circuit to perform one or more of: obtain source data from one or more data sources, the source data comprising a plurality of data items; produce training data based on the source data, wherein production of the training data includes removal of at least one of the plurality of data items from the source data based on one or more data metrics associated with a machine learning framework; provide the training data to the machine learning framework to generate a model version; receive the model version from the machine learning framework; deploy the model version to a target environment based on a status setting corresponding to the model version; and integrate a production repository associated with the model version to the target environment.

In various embodiments, the target environment includes a content services platform. In various such embodiments, the target environment includes a content repository of the content services platform. In many embodiments, the one or more data sources comprise at least a portion of a production repository. Some embodiments comprise instructions that, in response to being executed by the processor circuit, cause the processor circuit to generate a log of activity associated with one or more of obtaining the source data from the one or more data sources, producing the training data based on the source data, providing the training data to the machine learning framework, receiving the model version from the machine learning framework, deploying the model version to the target environment based on the status setting corresponding to the model version, and integrating the production repository associated with the model version to the target environment. In one or more embodiments, each data item includes data and metadata corresponding to the data. In several embodiments, production of the training data includes transformation of at least one of the plurality of data items in the source data from a format incompatible with the machine learning framework into a format compatible with the machine learning framework.

In yet another aspect, the present disclosure relates to a computer-implemented method comprising one or more of: obtaining source data from one or more data sources, the source data comprising a plurality of data items; producing training data based on the source data, wherein production of the training data includes removal of at least one of the plurality of data items from the source data based on one or more data metrics associated with a machine learning framework; providing the training data to the machine learning framework to generate a model version; receiving the model version from the machine learning framework; and deploying the model version to a target environment based on a status setting corresponding to the model version.

In some embodiments, the computer-implemented method includes integrating a production repository associated with the model version to the target environment. In various embodiments, production of the training data includes transforming at least one of the plurality of data items in the source data from a format incompatible with the machine learning framework into a format compatible with the machine learning framework. In many embodiments, the computer-implemented method includes generating a log of activity associated with one or more of obtaining the source data from the one or more data sources, producing the training data based on the source data, providing the training data to the machine learning framework, receiving the model version from the machine learning framework, deploying the model version to the target environment based on the status setting corresponding to the model version, and integrating the production repository associated with the model version to the target environment.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an exemplary operating environment for a machine learning (ML) integrator according to one or more embodiments described hereby.

FIG. 2 illustrates exemplary aspects of an ML integrator according to one or more embodiments described hereby.

FIG. 3 illustrates exemplary aspects of a content manager according to one or more embodiments described hereby.

FIG. 4 illustrates exemplary aspects of an implementation manager according to one or more embodiments described hereby.

FIG. 5 illustrates exemplary aspects of an audit system according to one or more embodiments described hereby.

FIG. 6 illustrates an exemplary logic flow according to one or more embodiments described hereby.

FIG. 7 illustrates exemplary aspects of a computing system according to one or more embodiments described hereby.

FIG. 8 illustrates exemplary aspects of a communications architecture according to one or more embodiments described hereby.

DETAILED DESCRIPTION

Various embodiments are generally directed to techniques for dynamically integrating machine learning (ML) functionality into computing systems, such as a content services platform (CSP), for instance. Many embodiments include ML integrated into a CSP and using production content as corpora (e.g., training and/or evaluation data). Some embodiments are particularly directed to generating and updating data for training and evaluating ML models, then making identified ML models available in various target environments. For example, embodiments may provide automatic, or semi-automatic, updating and deploying of ML models for making inferences, such as inferring labels for data in a content repository of a CSP. These and other embodiments are described and claimed.

Many challenges face machine learning integration. Developing an ML project or use case implementation with a content repository requires an overly complex and resource-intensive amount of data engineering, infrastructure, and integration. For example, continually evolving data as well as frequently changing data sources necessitates frequent updates and changes to the process for obtaining and/or preparing updated content (e.g., training data). In another example, continually evolving target environments with dynamic operating parameters necessitate frequent updates and changes to the process for deploying, hosting, and/or applying ML models to data. In both examples, the updates and changes typically require evaluation by multiple experts that specialize in different aspects of ML (e.g., a team of data scientists and cross-functional groups that must work together and understand each other and the dependencies). Further, more experts may be required to interpret and analyze predictions. This can make model/content integration unbearably resource intensive in terms of effort, cost, and time.

Adding more complexity, integrating a production repository with the environment where a model is deployed can create various security, privacy, and data engineering concerns. For instance, integration could lead to the exposure of sensitive data. Adding still more complexity, failing to track historical configurations and settings complicates identifying, diagnosing, and/or repairing issues. For example, distinguishing data produced, or decisions made, by ML models from human-generated data can be difficult or impossible without adequate auditing. These and other factors can make model/content integration more challenging than the ML development to create an ML model, and result in expensive, complex, and inefficient integration of ML. Such limitations can drastically reduce functionality while increasing the need for manual input and expert guidance, contributing to lost economies of scale, missed insights, and inefficient systems, devices, and techniques with limited capabilities.

Various embodiments described hereby include an ML integrator to automatically source and transform all content (e.g., source data) into renditions (e.g., training data) ready for input into an ML framework. Many embodiments continuously send new or modified content to the applicable ML models. Models may be readily retrained or optimized (e.g., hyperparameter tuning) with the most recent content. Further, the new model versions can be governed in terms of active/inactive status and target environments. Several embodiments include the ability to selectively publish models. For example, models may be published (e.g., deployed and hosted) with labels, being active and available to one or more target environments (e.g., corresponding environment instances). Multiple embodiments enable dynamic model generation and utilization, such as by training new model iterations with production content. One or more of these features can enable use cases for content that is very dynamic and models that have short lifespans, such as when new categories and content are being added all the time.

Many embodiments automate and/or enable users to avoid issues with data engineering (e.g., manual transformations, cleaning data, continuous content extraction, transformation, and loading), training (e.g., model creation, update, and publishing), and model usage and integration with CSP content structures and metadata.

Several embodiments provide users with an intuitive interface system that enables users to derive value from ML by providing a uniform user experience with intuitive functionalities to implement the predictive power of ML models. For example, the interface system may guide users through the operation and configuration of various aspects of the ML integrator, including analysis, interpretation, and/or resolution of associated data, issues, and results. In another example, the interface system may allow a bulk action to be run on lists of documents (e.g., from a search, folder, or collection) that asks one or more ML models for predictions. In such examples, the results may be filled in depending on thresholds that are configurable via the interface system. In yet another example, each time a form is accessed to create or edit content, suggestions for the new content may automatically be provided from model predictions. Many embodiments allow bulk predictions to be applied in a safe way that enables the scalability of AI to be tapped into.

Several embodiments include an audit system that tracks operation of the ML integrator and/or components connected thereto. For example, the lifecycle, usage, and performance data of a model can be audited, logged, and made available for analysis. Further, automatic actions from predictions may be backtracked to a previous state. In another example, information, such as visual indications, may be provided to distinguish content generated by ML models from content generated by manual input. One or more of these components and/or techniques may be used as part of a process for ML integration, resulting in more dynamic, efficient, and intuitive ML integration.

One or more techniques described hereby may facilitate the efficient integration of dynamic ML models, leading to useful and previously unknown relationships between data objects being identified. In these and other ways, components/techniques described hereby may increase efficiency, decrease performance costs, decrease computational cost, and/or reduce resource requirements to integrate ML in an accurate, reactive, efficient, dynamic, and scalable manner, resulting in several technical effects and advantages over conventional computer technology, including increased capabilities and improved adaptability. In various embodiments, one or more of the aspects, techniques, and/or components described hereby may be implemented in a practical application via one or more computing devices, and thereby provide additional and useful functionality to the one or more computing devices, resulting in more capable, better functioning, and improved computing devices. For example, a practical application may include integrating ML with a CSP. Further, one or more of the aspects, techniques, and/or components described hereby may be utilized to improve the technical fields of ML, ML integration, CSPs, and/or content management.

In several embodiments, components described hereby may provide specific and particular manners to enable inferences to be made using ML models. In many embodiments, one or more of the components described hereby may be implemented as a set of rules that improve computer-related technology by allowing a function not previously performable by a computer that enables an improved technological result to be achieved. For example, the function allowed may include one or more of the specific and particular techniques disclosed hereby, such as one or more of: continuously sending new or modified content to applicable ML models; readily retraining or optimizing ML models with the most recent content; governing model versions in terms of active/inactive status and target environments; enabling dynamic and automated model generation and utilization, such as by training new model iterations with production content; guiding users through the operation and configuration of various aspects of the ML integrator; allowing a bulk action to be run on lists of documents (e.g., from a search, folder, or collection) that asks one or more ML models for predictions; and auditing, logging, and making available the lifecycle, usage, and performance data of ML models.

With general reference to notations and nomenclature used hereby, one or more portions of the detailed description which follows may be presented in terms of program procedures executed on a computer or network of computers. These procedural descriptions and representations are used by those skilled in the art to most effectively convey the substance of their work to others skilled in the art. A procedure is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. These operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical, magnetic, or optical signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It proves convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be noted, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to those quantities.

Further, these manipulations are often referred to in terms, such as adding or comparing, which are commonly associated with mental operations performed by a human operator. However, no such capability of a human operator is necessary, or desirable in most cases, in any of the operations described hereby that form part of one or more embodiments. Rather, these operations are machine operations. Useful machines for performing operations of various embodiments include general purpose digital computers as selectively activated or configured by a computer program stored within that is written in accordance with the teachings hereby, and/or include apparatus specially constructed for the required purpose. Various embodiments also relate to apparatus or systems for performing these operations. These apparatuses may be specially constructed for the required purpose or may include a general-purpose computer. The required structure for a variety of these machines will be apparent from the description given.

Reference is now made to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding thereof. It may be evident, however, that the novel embodiments can be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form to facilitate a description thereof. The intention is to cover all modifications, equivalents, and alternatives within the scope of the claims.

FIG. 1 illustrates an exemplary operating environment 100 for a machine learning (ML) integrator 104 according to one or more embodiments described hereby. Operating environment 100 may include ML integrator 104 in conjunction with one or more data sources 102, an ML framework 106, and one or more target environments 108. The ML integrator 104 may utilize ML framework 106 to generate ML models based on information from data sources 102. The ML integrator 104 may then make the ML models available for making inferences in the one or more target environments 108. In many embodiments, target environments 108 may include one or more portions of a computing platform, such as a content services platform. In various embodiments, the content services platform may include a variety of processes and technologies that support the collecting, transforming, managing, analyzing, and publishing of information (e.g., digital content). Accordingly, ML integrator 104 may operate to implement ML techniques in the collection, management, analysis, and/or publishing of content to gain valuable insights and efficiencies. For example, ML integrator 104 may enable ML models to be used in a target environment to suggest labels characterizing images of new products stored in a content repository. Embodiments are not limited in this context.

In various embodiments, the ML integrator 104 may obtain source data from the one or more data sources 102 and produce training data and/or evaluation data for input into the ML framework 106 for generation of ML models. For example, ML integrator 104 may migrate, transform, and/or clean data from data sources 102 prior to providing it to ML framework 106 for training models. In some embodiments, ML integrator 104 may be utilized to select previously existing data for providing to ML framework 106. For example, the previously existing data may have been generated from prior migrating, transforming, and/or cleaning operations performed by the ML integrator. In another example, the previously existing data may include third-party training data. Further, the ML integrator may selectively make ML models generated by ML framework 106 available in one or more of the target environments, such as based on labels and statuses applied to different models and model versions. As will be appreciated, reference to training data or evaluation data herein may implicitly include corresponding evaluation data or training data as well as any other data needed as input by the ML framework 106.

In many embodiments, ML integrator 104 may integrate one or more of the data sources 102 with one or more of the target environments 108. For example, data sources 102 may include a production repository accessible via one or more of the target environments 108 (e.g., a content services platform). In such examples, the ML integrator 104 may utilize the production repository in the CSP as a data source 102. In many embodiments, utilizing the production repository as a data source 102 may enable models to be updated frequently and efficiently based on the most recent data. Further, one or more components of the ML integrator 104 and/or ML framework 106 may be integrated with one or more of the target environments 108.

In several embodiments, ML integrator 104 automates and/or provides guidance for various functionalities and techniques described hereby to reduce the need for manual and/or expert input in generating, integrating, and updating training data or ML models. In many embodiments, ML integrator 104 may track operation of various components of the ML integrator and/or components connected thereto. In some embodiments, ML integrator 104 may generate a historical index of operational parameters and metrics of the components. In some such embodiments, the historical index may be used to identify, diagnose, and/or repair issues. For example, the historical index may allow reversion to a prior stable configuration. In another example, configurations of select components may be restored to a previous version at one or more points in time.

FIG. 2 illustrates exemplary aspects of an ML integrator 204 in environment 200 according to one or more embodiments described hereby. Environment 200 includes one or more data sources 202, one or more target environments 208, ML integrator 204, and an ML framework 206. In the illustrated embodiment, ML integrator 204 includes content manager 210, implementation manager 212, interface system 216, and audit system 214, and ML framework 206 includes ML model builder 218 and feature extractor 220. In some embodiments, environment 200 may include one or more components that are the same or similar to one or more other components described hereby. For example, ML integrator 204 may be the same or similar to ML integrator 104. Embodiments are not limited in this context.

Generally, the content manager 210 may embody functionalities of the ML integrator 204 associated with providing inputs to the ML framework 206, the implementation manager 212 may embody functionalities of the ML integrator 204 associated with making the outputs of the ML framework 206 available for making inferences via the one or more target environments 208, and the interface system 216 may embody functionalities of the ML integrator 204 associated with obtaining user input for determining which inferences to make. For example, the content manager 210 may provide training data to ML framework 206. In response, the ML framework may produce an ML model. The implementation manager 212 may utilize the ML model to make inferences available via an application programming interface (API), and the interface system 216 may provide a menu in a graphical user interface of the target environment.

In several embodiments, ML integrator 204 may utilize ML framework 206 to generate ML models based on information from data sources 202. For example, content manager 210 may utilize data from data sources 202 to produce training data for providing to ML model builder 218 and/or feature extractor 220 of ML framework 206. In many such examples, the data from data sources 202 includes production content that is accessible via the target environment. In many embodiments, models produced by the ML framework 206 may be made available for making inferences in the one or more target environments 208. For example, implementation manager 212 may deploy and/or host one or more of the ML models to make them available for inferences in one or more of the target environments 208. In various embodiments, implementation manager 212 may integrate a production repository with the environment where a corresponding model is deployed.

In various embodiments, features and functionalities of ML integrator 204 may be accessible via interface system 216. In many such embodiments, interface system 216 is accessible via, or integrated with, one or more of the target environments 208. More generally, one or more components of the ML integrator 204 may be integrated with one or more of the target environments 208. In several embodiments, one or more of the data sources 202 may be integrated with one or more of the target environments 208.

In many embodiments, the interface system 216 provides users with an intuitive experience that enables users to derive value from ML by providing a uniform and familiar interface. Further, interfaces may be kept clean, with displayed functionalities limited to those relevant to a current objective. In many embodiments, the interface system 216 may guide users through the operation and configuration of various aspects of the ML integrator 204, data sources 202, ML framework 206, and/or target environments 208, including analysis, interpretation, and/or resolution of associated data, issues, and results. Additional user interface and experience features may be relevant to aspects of the embodiments herein, as described in more detail in U.S. Patent Application, filed even date herewith, and titled “Techniques for Intuitive Machine Learning Development and Optimization,” the entirety of which application is incorporated by reference herein.

In some embodiments, the interface system 216 may allow a bulk action to be run on lists of documents (e.g., from a search, folder, or collection in a target environment) that asks one or more ML models for predictions. In such examples, the results may be filled in depending on thresholds that are configurable via the interface system (e.g., confidence level thresholds). In yet another example, each time a form is accessed to create or edit content, suggestions for the new content may automatically be provided from model predictions. For example, interface system 216 may enable ML models to be used in a target environment to suggest labels characterizing images of a new fashion line by season, color, photographer, and talent. In another example, products captured in an image may be identified. The implementation manager 212 may perform the inference requested via the interface system 216. In some embodiments, click-to-add suggestions may be provided when an interface for entering or modifying data is presented. Accordingly, in many embodiments, the interface system 216 may determine to generate and provide inferences without explicit request. Additionally, the interface system 216 may allow control over one or more components or operational parameters of the data sources 202, ML integrator 204, ML framework 206, and/or target environments 208. In many embodiments, the interface system 216 may cause one or more graphical user interfaces (GUIs), or portions of a GUI, to be presented, such as via target environments. In many such embodiments, user input is received while options, settings, metrics, and other output are presented via the GUIs or portions thereof. In one or more embodiments, interface system 216 may provide one or more interfaces independent of any target environments.
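By way of non-limiting illustration, the following Python sketch shows how such a bulk action might request predictions over a REST interface and apply only those meeting a configurable confidence threshold. The endpoint, the response shape, and the document fields are illustrative assumptions and do not reflect any particular implementation of interface system 216.

    import requests

    def request_prediction(endpoint, text):
        # Assumed JSON inference API returning
        # {"predictions": [{"label": ..., "confidence": ...}]}.
        resp = requests.post(endpoint, json={"instances": [text]}, timeout=10)
        resp.raise_for_status()
        return resp.json()["predictions"][0]

    def run_bulk_action(documents, endpoint, threshold=0.8):
        """Apply confident predictions in bulk; hold the rest for manual review."""
        applied, skipped = [], []
        for doc in documents:  # e.g., documents from a search, folder, or collection
            pred = request_prediction(endpoint, doc["text"])
            if pred["confidence"] >= threshold:
                doc.setdefault("labels", []).append(pred["label"])  # fill in suggestion
                applied.append(doc)
            else:
                skipped.append(doc)  # below threshold; leave for manual review
        return applied, skipped

Holding back low-confidence results, rather than applying them, is one way bulk predictions can be applied safely as described above.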

In several embodiments, the audit system 214 tracks operation of the ML integrator 204 and/or components connected thereto (e.g., data sources 202, ML framework 206, and target environments 208). For example, the lifecycle, usage, and performance data of a model can be audited, logged, and made available for analysis. Further, automatic actions from predictions may be backtracked to a previous state. In another example, information, such as visual indications, may be provided to distinguish content generated by ML models from content generated by manual input. Many embodiments allow bulk predictions to be applied in a safe way that enables the scalability of AI to be tapped into without fear of costly mistakes.

FIG. 3 illustrates exemplary aspects of a content manager 310 in environment 300 according to one or more embodiments described hereby. Environment 300 includes one or more data sources 302, content manager 310, and training data 334. In the illustrated embodiment, data sources 302 include source data 322 with data items 324-1, 324-2, 324-n (or data items 324), and content manager 310 includes migrator 326, formatter 328, normalizer 330, and filter 332. In various embodiments, the content manager 310 may generate training data 334 based on source data 322 obtained from data sources 302. In some embodiments, environment 300 may include one or more components that are the same or similar to one or more other components described hereby. For example, content manager 310 may be the same or similar to content manager 210. Embodiments are not limited in this context.

In various embodiments, content manager 310 may obtain source data 322 from data sources 302 and convert the source data 322 into training data 334 for providing to an ML framework. Migrator 326 may obtain the source data 322 via active or passive processes. For example, migrator 326 may periodically request source data 322 comprising new or modified data from one or more of the data sources 302. In another example, migrator 326 may receive source data 322 from one or more of the data sources 302 as a stream. In several embodiments, migrator 326 may obtain source data 322 comprising production content. Data items 324 may include data and/or metadata. In various embodiments, source data 322 is labeled (the labels comprising metadata). In other embodiments, source data 322 is unlabeled.
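By way of non-limiting illustration, a periodic-request (polling) variant of the migrator might be sketched in Python as follows. The fetch_items callable and the timestamp-based change detection are hypothetical stand-ins for any particular data source API.

    import time

    def poll_new_or_modified(fetch_items, since, interval_s=300):
        """Yield items created or modified after `since` (a Unix timestamp)."""
        while True:
            newest = since
            for item, modified_ts in fetch_items():
                if modified_ts > since:
                    newest = max(newest, modified_ts)
                    yield item  # hand the new or modified item to the pipeline
            since = newest      # advance the watermark after a full sweep
            time.sleep(interval_s)  # periodic, rather than streaming, acquisition

A streaming variant would instead consume items as the data source pushes them, with no polling interval.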

Formatter 328 may reformat the source data 322 based on requirements for ML techniques implemented by the ML framework. In several embodiments, formatter 328 handles converting data (e.g., content) into a rendition utilized by the ML framework. For example, formatter 328 may convert images to a target size or pixel count, such as a standard size for a data set. In another example, the resolution of an image may be changed to a target resolution. In yet another example, pixel RGB content may be formatted as an array. In another example, formatter 328 may convert data item 324-1 from a first file type to a second file type (e.g., a document to an image). In some embodiments, items, such as binary text, may be extracted from data. In some such embodiments, binary text with punctuation may be extracted from a portable document format (PDF) data item.
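As a minimal sketch of such a rendition conversion, the following Python example uses the Pillow imaging library and a 224x224 standard input size; both the library choice and the size are illustrative assumptions rather than requirements of the embodiments.

    from PIL import Image

    def image_rendition(path, size=(224, 224)):
        """Convert an image file into a fixed-size RGB pixel array."""
        img = Image.open(path).convert("RGB")  # normalize the color mode
        img = img.resize(size)                 # rescale to the standard data set size
        return list(img.getdata())             # pixel RGB content as an array of tuples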

Normalizer 330 may adjust values in the source data 322 that are measured on different scales to a notionally common scale. For example, a first data item with labels rating a data characteristic (e.g., pixel brightness, product durability, price, etc.) on a scale of 1-10 may be normalized with a second data item with labels rating the data characteristic on a scale of 1-5. Filter 332 may remove one or more data items 324 from source data 322. In several embodiments, filter 332 may remove one or more data items based on one or more data metrics associated with an ML framework (e.g., ML framework 206). For example, filter 332 may remove data items with an incompatible file format. In another example, filter 332 may remove data items failing a size or quality threshold.
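By way of non-limiting illustration, the 1-10 to 1-5 example above corresponds to a linear rescaling, and the filter's data metrics can be expressed as simple predicates; the specific metrics below (file extension and byte size) are illustrative assumptions.

    def rescale(value, old_min, old_max, new_min, new_max):
        """Linearly map a rating from one scale onto another (e.g., 1-10 onto 1-5)."""
        return new_min + (value - old_min) * (new_max - new_min) / (old_max - old_min)

    def filter_items(items, allowed_exts=(".jpg", ".png"), min_bytes=1024):
        """Drop data items failing the framework's data metrics (type, size)."""
        return [item for item in items
                if item["ext"] in allowed_exts and item["size"] >= min_bytes]

For instance, rescale(7, 1, 10, 1, 5) maps a 7-of-10 rating to approximately 3.7 on the 1-5 scale.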

In many embodiments, content manager 310 may continuously obtain new or modified content via the migrator 326, identify one or more corresponding ML models, and utilize the new or modified content to generate training data 334 with one or more of formatter 328, normalizer 330, and filter 332 to train, retrain, and/or optimize (e.g., hyperparameter tuning) the one or more corresponding ML models. In some embodiments, one or more operational parameters of the content manager 310, including components thereof, may be dynamically determined based on a corresponding or target ML model.
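A compact sketch of how the formatter, normalizer, and filter might compose into such a continuous pipeline follows; each callable is a hypothetical stand-in for the corresponding component.

    def make_training_data(items, keep, fmt, norm):
        """Filter, then format and normalize, new or modified content items."""
        return [norm(fmt(item)) for item in items if keep(item)]

Each sweep of new or modified content from the migrator could be passed through such a pipeline before the corresponding models are retrained.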

FIG. 4 illustrates exemplary aspects of an implementation manager 412 in environment 400 according to one or more embodiments described hereby. Environment 400 includes one or more ML models 436, implementation manager 412, and target environments 408. In the illustrated embodiment, implementation manager 412 includes model manager 438 and production content manager 442, and the one or more target environments 408 include environments 446-1, 446-2, 446-n (or environments 446). In various embodiments, the implementation manager 412 may selectively utilize the one or more ML models 436 to make inference services available to select ones of the target environments 408. In some embodiments, environment 400 may include one or more components that are the same or similar to one or more other components described hereby. For example, implementation manager 412 may be the same or similar to implementation manager 212. Embodiments are not limited in this context.

In various embodiments, the model manager 438 may provide one or more of custodial, administrative, deployment, hosting, and inference functionalities to select environments 446 of target environments 408. For example, model manager 438 may provide a model repository that includes metadata for tracking characteristics of the models, such as statuses (active/inactive), correlations between models, performance statistics, versions, instances, deployment history, and the like. Model manager 438 may allow models to be published with labels, being active and available to the corresponding environment instances. Further, model manager 438 may deploy a new model for use via a target environment (e.g., an external infrastructure), such as for use by a content repository. In some embodiments, a model may be deployed based on a status setting (e.g., active/inactive) corresponding to the model version. In various embodiments, deploying a model may include deploying a Docker image with a TensorFlow Serving service with the model (e.g., in a zip file) inside. In several embodiments, model manager 438 may automatically orchestrate deploying models and/or configuring the model for deployment, such as on a CSP.
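By way of non-limiting illustration, deploying such a Docker image could be orchestrated as below. The docker run flags follow TensorFlow Serving's documented container usage, while the status check, paths, and port are illustrative assumptions.

    import subprocess

    def deploy_if_active(model_name, model_dir, status):
        """Launch a TensorFlow Serving container only for an active model version."""
        if status != "active":  # the status setting governs deployment
            return None
        return subprocess.run([
            "docker", "run", "-d", "-p", "8501:8501",
            "-v", f"{model_dir}:/models/{model_name}",  # mount the exported model
            "-e", f"MODEL_NAME={model_name}",
            "tensorflow/serving",
        ], check=True)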

In many embodiments, the production content manager 442 may integrate a production repository with the environment where a corresponding model is deployed. In various embodiments, integration may include registering/configuring a model on a production instance so that the instance is aware of the model and queries the model. For example, integration may include one or more of routing queries to the appropriate endpoint (e.g., correct model version) where the model is deployed, consolidating query results, and sending the resulting predictions to the instance.
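A minimal Python sketch of such routing and consolidation follows, assuming a registry dictionary and a TensorFlow Serving-style REST endpoint; both are illustrative assumptions.

    import requests

    # Hypothetical registry mapping (model, version) pairs to serving endpoints.
    ENDPOINTS = {
        ("labeler", "v3"): "https://serving.example.com/v1/models/labeler:predict",
    }

    def route_query(model, version, instances):
        """Route a query to the registered endpoint and return its predictions."""
        url = ENDPOINTS[(model, version)]  # the correct model version's endpoint
        resp = requests.post(url, json={"instances": instances}, timeout=10)
        resp.raise_for_status()
        return resp.json()["predictions"]  # consolidated results for the instance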

In some embodiments, the production content manager 442 may provide security to the production repository, such as by anonymizing or encrypting content. In many embodiments, communication utilizes secure APIs. For example, APIs may be authenticated with and/or utilize hypertext transfer protocol secure (HTTPS).
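By way of non-limiting illustration, a token-authenticated HTTPS call to such a secure API might look as follows; the bearer-token scheme and URL are illustrative assumptions.

    import requests

    def secure_predict(url, token, payload):
        """Call an inference API over HTTPS with token authentication."""
        resp = requests.post(
            url,
            json=payload,
            headers={"Authorization": f"Bearer {token}"},
            verify=True,  # enforce TLS certificate verification
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()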

FIG. 5 illustrates exemplary aspects of an audit system 514 in environment 500 according to one or more embodiments described hereby. Environment 500 includes audit system 514 and one or more audit targets 556. In the illustrated embodiment, the audit system 514 includes audit manager 548 and historical index 552, and the audit targets 556 include data sources 502, ML framework 506, target environments 508, content manager 510, implementation manager 512, audit system 514, interface system 516, training data 534, and ML models 536. In various embodiments, audit system 514 may monitor and track operation of one or more of the audit targets 556. In many such embodiments, the audit targets 556 may include various components of the ML integrator and/or components connected thereto. In some embodiments, environment 500 may include one or more components that are the same or similar to one or more other components described hereby. For example, audit system 514 may be the same or similar to audit system 214. Embodiments are not limited in this context.

In various embodiments, the audit manager 548 may monitor and log activities of the audit targets. In various such embodiments, the audit manager 548 may generate historical index 552 based on the activities monitored and logged. In some embodiments, the historical index 552 may include a timeline of system states, component configurations, and/or performance metrics. In several embodiments, the audit manager 548 may enable interfacing with and restoring from the historical index. For example, audit manager 548 may enable a user to revert to a previous configuration of a component having issues. In some embodiments, audit manager 548 may provide guidance on identifying, diagnosing, and/or repairing issues. For example, the historical index 552 may allow reversion to a prior stable configuration. In another example, configurations of select components may be restored to a previous version at one or more points in time.
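A minimal Python sketch of a historical index supporting such reversion follows; the entry schema is an illustrative assumption.

    import copy
    import time

    class HistoricalIndex:
        """Append-only timeline of component configurations."""

        def __init__(self):
            self.entries = []

        def record(self, component, config):
            """Log a timestamped snapshot of a component's configuration."""
            self.entries.append({"ts": time.time(),
                                 "component": component,
                                 "config": copy.deepcopy(config)})

        def restore(self, component, before_ts):
            """Return the newest configuration recorded at or before `before_ts`."""
            candidates = [e for e in self.entries
                          if e["component"] == component and e["ts"] <= before_ts]
            return candidates[-1]["config"] if candidates else None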

In several embodiments, the audit manager 548 may implement one or more diagnostic procedures or measurements and record the results in the historical index 552. In several such embodiments, the results may be utilized in identifying, diagnosing, and/or repairing issues, as well as providing guidance through the identifying, diagnosing, and/or repairing. Oftentimes, the audit system 514 is responsible for auditing and logging the lifecycle and usage of ML models and/or training data of various components. In some embodiments, audit system 514 may cause indications of whether data was generated by a human or an ML model to be provided. In some such embodiments, the indications may be provided to the interface system for presentation.

FIG. 6 illustrates a logic flow 600, in accordance with non-limiting example(s) of the present disclosure. Logic flow 600 can begin at block 602. At block 602 “obtain source data from one or more data sources, the source data comprising a plurality of data items” source data comprising a plurality of data items may be obtained from one or more data sources. For example, content manager 310 may obtain source data 322 comprising data items 324 from data sources 302.

Continuing to block 604 “produce training data based on the source data, wherein production of the training data includes removal of at least one of the plurality of data items from the source data based on one or more data metrics associated with a machine learning framework” training data may be produced based on the source data. For example, filter 332 may remove one or more of data items 324 from source data 322 during production of training data 334 based on data metrics associated with ML framework 206.

Continuing to block 606 “provide the training data to the machine learning framework to generate a model version” training data may be provided to the ML framework for generation of a model version. For instance, training data 334 may be provided to ML framework 206 for generation of one of ML models 436. At block 608 “receive the model version from the machine learning framework” the model version may be received from the ML framework after generation. For example, implementation manager 412 may receive the model version from ML framework 206.

Continuing to block 610 “deploy the model version to a target environment based on a status setting corresponding to the model version” the model version may be deployed to a target environment based on corresponding status settings. For example, model manager 438 may deploy one of ML models 436 to one or more of target environments 408 based on a status setting associated with the one of ML models 436. At block 612 “integrate a production repository associated with the model version to the target environment” a production repository associated with the model version may be integrated with the target environment. For example, production content manager 442 may integrate a production repository to environment 446-1 based on an associated model version being deployed to environment 446-1.
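By way of non-limiting illustration, logic flow 600 can be summarized in Python with each block mapped to a step; the callables are hypothetical stand-ins for the components described above.

    def logic_flow_600(obtain, produce, train, status_of, deploy, integrate):
        """Sketch of logic flow 600 with each component supplied as a callable."""
        source_data = obtain()                    # block 602: obtain source data
        training_data = produce(source_data)      # block 604: remove items per data metrics
        model_version = train(training_data)      # blocks 606/608: generate and receive version
        if status_of(model_version) == "active":  # the status setting governs deployment
            deploy(model_version)                 # block 610: deploy to the target environment
            integrate(model_version)              # block 612: integrate the production repository
        return model_version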

FIG. 7 illustrates an embodiment of a system 700 that may be suitable for implementing various embodiments described hereby. System 700 is a computing system with multiple processor cores such as a distributed computing system, supercomputer, high-performance computing system, computing cluster, mainframe computer, mini-computer, client-server system, personal computer (PC), workstation, server, portable computer, laptop computer, tablet computer, handheld device such as a personal digital assistant (PDA), or other device for processing, displaying, or transmitting information. Similar embodiments may comprise, e.g., entertainment devices such as a portable music player or a portable video player, a smart phone or other cellular phone, a telephone, a digital video camera, a digital still camera, an external storage device, or the like. Further embodiments implement larger scale server configurations. In other embodiments, the system 700 may have a single processor with one core or more than one processor. Note that the term “processor” refers to a processor with a single core or a processor package with multiple processor cores. In at least one embodiment, the computing system 700 is representative of one or more components described hereby, such as ML integrator 204, content manager 310, implementation manager 412, or audit system 514. More generally, the computing system 700 is configured to implement all logic, systems, logic flows, methods, apparatuses, and functionality described hereby with reference to FIGS. 1-6. The embodiments are not limited in this context.

As used in this application, the terms “system” and “component” and “module” are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution, examples of which are provided by the exemplary system 700. For example, a component can be, but is not limited to being, a process running on a processor, a processor, a hard disk drive, multiple storage drives (of optical, solid-state, and/or magnetic storage medium), an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a component. One or more components can reside within a process and/or thread of execution, and a component can be localized on one computer and/or distributed between two or more computers. Further, components may be communicatively coupled to each other by various types of communications media to coordinate operations. The coordination may involve the uni-directional or bi-directional exchange of information. For instance, the components may communicate information in the form of signals communicated over the communications media. The information can be implemented as signals allocated to various signal lines. In such allocations, each message is a signal. Further embodiments, however, may alternatively employ data messages. Such data messages may be sent across various connections. Exemplary connections include parallel interfaces, serial interfaces, and bus interfaces.

As shown in this figure, system 700 comprises a motherboard or system-on-chip (SoC) 702 for mounting platform components. Motherboard or system-on-chip (SoC) 702 is a point-to-point (P2P) interconnect platform that includes a first processor 704 and a second processor 706 coupled via a point-to-point interconnect 770 such as an Ultra Path Interconnect (UPI). In other embodiments, the system 700 may be of another bus architecture, such as a multi-drop bus. Furthermore, each of processor 704 and processor 706 may be processor packages with multiple processor cores including core(s) 708 and core(s) 710, respectively. While the system 700 is an example of a two-socket (2S) platform, other embodiments may include more than two sockets or one socket. For example, some embodiments may include a four-socket (4S) platform or an eight-socket (8S) platform. Each socket is a mount for a processor and may have a socket identifier. Note that the term platform refers to the motherboard with certain components mounted such as the processor 704 and chipset 732. Some platforms may include additional components and some platforms may only include sockets to mount the processors and/or the chipset. Furthermore, some platforms may not have sockets (e.g., SoC, or the like).

The processor 704 and processor 706 can be any of various commercially available processors, including without limitation an Intel® Celeron®, Core®, Core (2) Duo®, Itanium®, Pentium®, Xeon®, and XScale® processors; AMD® Athlon®, Duron® and Opteron® processors; ARM® application, embedded and secure processors; IBM® and Motorola® DragonBall® and PowerPC® processors; IBM and Sony® Cell processors; and similar processors. Dual microprocessors, multi-core processors, and other multi-processor architectures may also be employed as the processor 704 and/or processor 706. Additionally, the processor 704 need not be identical to processor 706.

Processor 704 includes an integrated memory controller (IMC) 720 and point-to-point (P2P) interface 724 and P2P interface 728. Similarly, the processor 706 includes an IMC 722 as well as P2P interface 726 and P2P interface 730. IMC 720 and IMC 722 couple processor 704 and processor 706, respectively, to respective memories (e.g., memory 716 and memory 718). Memory 716 and memory 718 may be portions of the main memory (e.g., a dynamic random-access memory (DRAM)) for the platform such as double data rate type 3 (DDR3) or type 4 (DDR4) synchronous DRAM (SDRAM). In the present embodiment, the memory 716 and memory 718 locally attach to the respective processors (i.e., processor 704 and processor 706). In other embodiments, the main memory may couple with the processors via a bus and shared memory hub.

System 700 includes chipset 732 coupled to processor 704 and processor 706. Furthermore, chipset 732 can be coupled to storage device 750, for example, via an interface (I/F) 738. The I/F 738 may be, for example, a Peripheral Component Interconnect-enhanced (PCI-e). Storage device 750 can store instructions executable by circuitry of system 700 (e.g., processor 704, processor 706, GPU 748, ML accelerator 754, vision processing unit 756, or the like). For example, storage device 750 can store instructions for ML integrator 204, content manager 310, implementation manager 412, audit system 514, logic flow 600, or the like.

Processor 704 couples to a chipset 732 via P2P interface 728 and P2P 734 while processor 706 couples to a chipset 732 via P2P interface 730 and P2P 736. Direct media interface (DMI) 776 and DMI 778 may couple the P2P interface 728 and the P2P 734 and the P2P interface 730 and P2P 736, respectively. DMI 776 and DMI 778 may each be a high-speed interconnect that facilitates, e.g., eight Giga Transfers per second (GT/s), such as DMI 3.0. In other embodiments, the processor 704 and processor 706 may interconnect via a bus.

The chipset 732 may comprise a controller hub such as a platform controller hub (PCH). The chipset 732 may include a system clock to perform clocking functions and include interfaces for an I/O bus such as a universal serial bus (USB), peripheral component interconnects (PCIs), serial peripheral interconnects (SPIs), inter-integrated circuits (I2Cs), and the like, to facilitate connection of peripheral devices on the platform. In other embodiments, the chipset 732 may comprise more than one controller hub such as a chipset with a memory controller hub, a graphics controller hub, and an input/output (I/O) controller hub.

In the depicted example, chipset 732 couples with a trusted platform module (TPM) 744 and UEFI, BIOS, FLASH circuitry 746 via I/F 742. The TPM 744 is a dedicated microcontroller designed to secure hardware by integrating cryptographic keys into devices. The UEFI, BIOS, FLASH circuitry 746 may provide pre-boot code.

Furthermore, chipset 732 includes the I/F 738 to couple chipset 732 with a high-performance graphics engine, such as graphics processing circuitry or a graphics processing unit (GPU) 748. In other embodiments, the system 700 may include a flexible display interface (FDI) (not shown) between the processor 704 and/or the processor 706 and the chipset 732. The FDI interconnects a graphics processor core in one or more of processor 704 and/or processor 706 with the chipset 732.

Additionally, ML accelerator 754 and/or vision processing unit 756 can be coupled to chipset 732 via I/F 738. ML accelerator 754 can be circuitry arranged to execute ML-related operations (e.g., training, inference, etc.) for ML models. Likewise, vision processing unit 756 can be circuitry arranged to execute vision processing specific or related operations. In particular, ML accelerator 754 and/or vision processing unit 756 can be arranged to execute mathematical operations and/or operands useful for machine learning, neural network processing, artificial intelligence, vision processing, etc.

Various I/O devices 760 and display 752 couple to the bus 772, along with a bus bridge 758 which couples the bus 772 to a second bus 774 and an I/F 740 that connects the bus 772 with the chipset 732. In one embodiment, the second bus 774 may be a low pin count (LPC) bus. Various devices may couple to the second bus 774 including, for example, a keyboard 762, a mouse 764, and communication devices 766.

Furthermore, an audio I/O 768 may couple to second bus 774. Many of the I/O devices 760 and communication devices 766 may reside on the motherboard or system-on-chip (SoC) 702 while the keyboard 762 and the mouse 764 may be add-on peripherals. In other embodiments, some or all the I/O devices 760 and communication devices 766 are add-on peripherals and do not reside on the motherboard or system-on-chip (SoC) 702.

FIG. 8 illustrates a block diagram of an exemplary communications architecture 800 suitable for implementing various embodiments as previously described, such as communications between ML integrator 104 and data sources 102, ML framework 106, and/or target environments 108. The communications architecture 800 includes various common communications elements, such as a transmitter, receiver, transceiver, radio, network interface, baseband processor, antenna, amplifiers, filters, power supplies, and so forth. The embodiments, however, are not limited to implementation by the communications architecture 800.

As shown in FIG. 8, the communications architecture 800 includes one or more clients 802 and servers 804. In some embodiments, communications architecture 800 may include or implement one or more portions of components, applications, and/or techniques described hereby. The clients 802 and the servers 804 are operatively connected to one or more respective client data stores 808 and server data stores 810 that can be employed to store information local to the respective clients 802 and servers 804, such as cookies and/or associated contextual information. In various embodiments, any one of servers 804 may implement one or more of logic flows or operations described hereby, such as in conjunction with storage of data received from any one of clients 802 on any of server data stores 810. In one or more embodiments, one or more of client data store(s) 808 or server data store(s) 810 may include memory accessible to one or more portions of components, applications, and/or techniques described hereby.

The clients 802 and the servers 804 may communicate information between each other using a communication framework 806. The communications framework 806 may implement any well-known communications techniques and protocols. The communications framework 806 may be implemented as a packet-switched network (e.g., public networks such as the Internet, private networks such as an enterprise intranet, and so forth), a circuit-switched network (e.g., the public switched telephone network), or a combination of a packet-switched network and a circuit-switched network (with suitable gateways and translators).

The communications framework 806 may implement various network interfaces arranged to accept, communicate, and connect to a communications network. A network interface may be regarded as a specialized form of an input output interface. Network interfaces may employ connection protocols including without limitation direct connect, Ethernet (e.g., thick, thin, twisted pair 10/100/1000 Base T, and the like), token ring, wireless network interfaces, cellular network interfaces, IEEE 802.11a-x network interfaces, IEEE 802.16 network interfaces, IEEE 802.20 network interfaces, and the like. Further, multiple network interfaces may be used to engage with various communications network types. For example, multiple network interfaces may be employed to allow for the communication over broadcast, multicast, and unicast networks. Should processing requirements dictate a greater amount of speed and capacity, distributed network controller architectures may similarly be employed to pool, load balance, and otherwise increase the communicative bandwidth required by clients 802 and the servers 804. A communications network may be any one or combination of wired and/or wireless networks including without limitation a direct interconnection, a secured custom connection, a private network (e.g., an enterprise intranet), a public network (e.g., the Internet), a Personal Area Network (PAN), a Local Area Network (LAN), a Metropolitan Area Network (MAN), an Operating Missions as Nodes on the Internet (OMNI), a Wide Area Network (WAN), a wireless network, a cellular network, and other communications networks.

Various embodiments may be implemented using hardware elements, software elements, or a combination of both. Examples of hardware elements may include processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate arrays (FPGA), logic gates, registers, semiconductor devices, chips, microchips, chip sets, and so forth. Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an embodiment is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds, and other design or performance constraints.

One or more aspects of at least one embodiment may be implemented by representative instructions stored on a machine-readable medium which represents various logic within the processor, which when read by a machine causes the machine to fabricate logic to perform the techniques described hereby. Such representations, known as “IP cores,” may be stored on a tangible, machine-readable medium and supplied to various customers or manufacturing facilities to load into the fabrication machines that actually make the logic or processor. Some embodiments may be implemented, for example, using a machine-readable medium or article which may store an instruction or a set of instructions that, if executed by a machine, may cause the machine to perform a method and/or operations in accordance with the embodiments. Such a machine may include, for example, any suitable processing platform, computing platform, computing device, processing device, computing system, processing system, computer, processor, or the like, and may be implemented using any suitable combination of hardware and/or software. The machine-readable medium or article may include, for example, any suitable type of memory unit, memory device, memory article, memory medium, storage device, storage article, storage medium, and/or storage unit, for example, memory, removable or non-removable media, erasable or non-erasable media, writeable or re-writeable media, digital or analog media, hard disk, floppy disk, Compact Disk Read Only Memory (CD-ROM), Compact Disk Recordable (CD-R), Compact Disk Rewriteable (CD-RW), optical disk, magnetic media, magneto-optical media, removable memory cards or disks, various types of Digital Versatile Disk (DVD), a tape, a cassette, or the like. The instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, encrypted code, and the like, implemented using any suitable high-level, low-level, object-oriented, visual, compiled, and/or interpreted programming language.
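By way of example and not limitation, the following Python sketch illustrates one hypothetical set of instructions implementing a simplified variant of the described pipeline: obtaining source data, removing data items that fail a data metric, receiving a model version, deploying it while snapshotting the target environment, logging activity with a source tag that distinguishes machine-generated entries from manual input, and restoring a previous state from a historical index. All names (DataItem, produce_training_data, and so forth) and the file size threshold are illustrative assumptions, not the claimed implementation.

    import time
    from dataclasses import dataclass

    @dataclass
    class DataItem:
        data: bytes       # the data of the data item
        metadata: dict    # metadata corresponding to the data

    activity_log = []      # log of activity (audit trail)
    historical_index = []  # snapshots of prior target-environment states

    def log_activity(event, source="machine"):
        # Tag each entry so machine-generated activity can be
        # distinguished from manual input on later inspection.
        activity_log.append({"time": time.time(), "event": event, "source": source})

    def produce_training_data(source_items, max_bytes=1_000_000):
        # Remove data items failing a data metric (here, a file size threshold).
        kept = [item for item in source_items if len(item.data) <= max_bytes]
        log_activity(f"produced training data: kept {len(kept)} of {len(source_items)} items")
        return kept

    def receive_model_version(training_items):
        # Stand-in for providing training data to a machine learning
        # framework and receiving a model version in return.
        version = {"version": 1, "trained_on": len(training_items)}
        log_activity(f"received model version {version['version']}")
        return version

    def deploy_model_version(version, target_env):
        # Stand-in for packaging the model version (e.g., in a container
        # image), deploying it, and registering it with a production repository.
        historical_index.append(dict(target_env))  # snapshot for later restoration
        target_env["model"] = version
        target_env["registered_with"] = "production-repository"
        log_activity(f"deployed model version {version['version']}")

    def restore_previous_state(target_env):
        # Roll the target environment back to a previous state using the
        # historical index, recording the restoration in the log of activity.
        target_env.clear()
        target_env.update(historical_index.pop())
        log_activity("restored target environment to previous state")

    # Example run of the sketched pipeline.
    source = [DataItem(b"x" * 10, {"type": "text"}),
              DataItem(b"y" * 2_000_000, {"type": "blob"})]
    env = {}
    training = produce_training_data(source)
    deploy_model_version(receive_model_version(training), env)
    restore_previous_state(env)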

The foregoing description of example embodiments has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the present disclosure to the precise forms disclosed. Many modifications and variations are possible in light of this disclosure. It is intended that the scope of the present disclosure be limited not by this detailed description, but rather by the claims appended hereto. Future-filed applications claiming priority to this application may claim the disclosed subject matter in a different manner and may generally include any set of one or more limitations as variously disclosed or otherwise demonstrated hereby.

The invention claimed is:
1. An apparatus, the apparatus comprising: a processor; and a memory comprising instructions that when executed by the processor cause the processor to: obtain source data from one or more data sources, the source data comprising a plurality of data items; produce training data based on the source data, wherein production of the training data includes removal of at least one of the plurality of data items from the source data based on one or more data metrics associated with a machine learning framework; provide the training data to the machine learning framework to generate a model version; receive the model version from the machine learning framework; deploy a Docker image, with the model version, based on a status setting corresponding to the model version; deploy the Docker image to a target environment to deploy the model version to the target environment; register the model version with an instance of a production repository to associate the model version to the target environment; generate a log of activity associated with the training data via an audit system that provides indications to distinguish content of machine learning models from manual input, wherein the log of activity is associated with deploying the model version to the target environment based on the status setting corresponding to the model version, and with integrating the production repository associated with the model version to the target environment; and restore the target environment and the one or more data sources to a previous state based on the log of activity and a historical index.
2. The apparatus of claim 1, wherein the target environment includes a content services platform.
3. The apparatus of claim 2, wherein the target environment includes a content repository of the content services platform.
4. The apparatus of claim 1, the memory comprising instructions that when executed by the processor cause the processor to generate the log of activity associated with one or more of: obtaining the source data from the one or more data sources, producing the training data based on the source data, providing the training data to the machine learning framework, and receiving the model version from the machine learning framework.
5. The apparatus of claim 1, wherein each data item includes data and metadata corresponding to the data.
6. The apparatus of claim 1, wherein production of the training data includes transformation of at least one of the plurality of data items in the source data from a format incompatible with the machine learning framework into a format compatible with the machine learning framework.
7. The apparatus of claim 1, wherein the data metric comprises one or more of a file type, a file size threshold, a quality threshold, and a reliability indicator.
8. The apparatus of claim 1, wherein production of the training data includes normalizing the source data.
9. The apparatus of claim 1, wherein the log of activity associated with the training data via the audit system enables actions from a prediction to be backtracked to the previous state.
10. At least one non-transitory computer-readable medium comprising a set of instructions that, in response to being executed by a processor circuit, cause the processor circuit to: obtain source data from one or more data sources, the source data comprising a plurality of data items; produce training data based on the source data, wherein production of the training data includes removal of at least one of the plurality of data items from the source data based on one or more data metrics associated with a machine learning framework; provide the training data to the machine learning framework to generate a model version; receive the model version from the machine learning framework; deploy a Docker image, with the model version, based on a status setting corresponding to the model version; deploy the Docker image to a target environment to deploy the model version to the target environment; register the model version with an instance of a production repository to associate the model version to the target environment; generate a log of activity associated with the training data that provides indications to distinguish content of machine learning models from manual input, wherein the log of activity is associated with deploying the model version to the target environment based on the status setting corresponding to the model version, and with integrating the production repository associated with the model version to the target environment; and restore the target environment and the one or more data sources to a previous state based on the log of activity and a historical index.
11. The at least one non-transitory computer-readable medium of claim 10, wherein the target environment includes a content services platform.
12. The at least one non-transitory computer-readable medium of claim 11, wherein the target environment includes a content repository of the content services platform.
13. The at least one non-transitory computer-readable medium of claim 10, comprising instructions that, in response to being executed by the processor circuit, cause the processor circuit to generate the log of activity associated with one or more of obtaining the source data from the one or more data sources, producing the training data based on the source data, providing the training data to the machine learning framework, receiving the model version from the machine learning framework, deploying the model version to the target environment based on the status setting corresponding to the model version, and integrating the production repository associated with the model version to the target environment.
14. The at least one non-transitory computer-readable medium of claim 10, wherein each data item includes data and metadata corresponding to the data.
15. The at least one non-transitory computer-readable medium of claim 10, wherein production of the training data includes transformation of at least one of the plurality of data items in the source data from a format incompatible with the machine learning framework into a format compatible with the machine learning framework.
16. The at least one non-transitory computer-readable medium of claim 10, wherein the log of activity associated with the training data enables actions from a prediction to be backtracked to the previous state.
17. A computer-implemented method, comprising: obtaining source data from one or more data sources, the source data comprising a plurality of data items; producing training data based on the source data, wherein production of the training data includes removal of at least one of the plurality of data items from the source data based on one or more data metrics associated with a machine learning framework; providing the training data to the machine learning framework to generate a model version; receiving the model version from the machine learning framework; deploying a Docker image, with the model version, based on a status setting corresponding to the model version; deploying the Docker image to a target environment to deploy the model version to the target environment; registering the model version with an instance of a production repository to associate the model version to the target environment; generating a log of activity associated with the training data that provides indications to distinguish content of machine learning models from manual input, wherein the log of activity is associated with deploying the model version to the target environment based on the status setting corresponding to the model version, and with integrating the production repository associated with the model version to the target environment; and restoring the target environment and the one or more data sources to a previous state based on the log of activity and a historical index.
18. The computer-implemented method of claim 17, comprising generating the log of activity associated with one or more of obtaining the source data from the one or more data sources, producing the training data based on the source data, providing the training data to the machine learning framework, receiving the model version from the machine learning framework, deploying the model version to the target environment based on the status setting corresponding to the model version, and integrating the production repository associated with the model version to the target environment.
19. The computer-implemented method of claim 17, wherein production of the training data includes transforming at least one of the plurality of data items in the source data from a format incompatible with the machine learning framework into a format compatible with the machine learning framework.
20. The computer-implemented method of claim 17, wherein the log of activity associated with the training data enables actions from a prediction to be backtracked to the previous state.